A Data Placement Algorithm for Data Intensive Applications in Cloud

Authors

Jucheng Yang

Qing Zhao

Congcong Xiong

Kunyu Zhang

Yang Yue

Abstract

Data layout is an important problem: a good layout reduces data movement among data centers and thereby improves the efficiency of the entire cloud system. This paper proposes a data layout algorithm oriented toward data-intensive applications, based on hierarchical data correlation clustering and the particle swarm optimization (PSO) algorithm. Datasets with fixed locations are taken into account, and both an offline strategy and an online strategy for data layout are given. Because the proposed strategy aims to reduce the global amount of data transmission, and because special permissions on the datasets are introduced, the cost of data transmission can be measured more reasonably. Simulation results show that, compared with two classical strategies, the algorithm reduces the amount of data transmission more effectively.
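To make the two quantities named in the abstract concrete, the following is a minimal sketch of a data correlation count and a data transmission measure for a given layout. It is not the authors' algorithm: the clustering, the PSO search, fixed-location datasets, and permissions are all omitted. The function names, the rule that a task runs at the data center already holding most of its input, and the sample tasks, sizes, and layouts are assumptions made only for illustration.

```python
from collections import Counter
from itertools import combinations


def correlation(tasks):
    """Count, for each pair of datasets, how many tasks use both (assumed measure)."""
    corr = Counter()
    for datasets in tasks.values():
        for a, b in combinations(sorted(datasets), 2):
            corr[(a, b)] += 1
    return corr


def transfer_volume(tasks, sizes, placement):
    """Total data moved, assuming each task runs where most of its input already sits."""
    total = 0
    for datasets in tasks.values():
        # Volume of this task's input held at each data center.
        local = Counter()
        for d in datasets:
            local[placement[d]] += sizes[d]
        # Everything not already at the chosen center must be transferred.
        total += sum(sizes[d] for d in datasets) - max(local.values())
    return total


# Hypothetical workload: three tasks over three datasets (sizes in arbitrary units).
tasks = {"t1": {"d1", "d2"}, "t2": {"d2", "d3"}, "t3": {"d1", "d2"}}
sizes = {"d1": 10, "d2": 20, "d3": 5}

# Two candidate layouts over two hypothetical data centers.
placement_a = {"d1": "dc1", "d2": "dc1", "d3": "dc2"}
placement_b = {"d1": "dc1", "d2": "dc2", "d3": "dc2"}

print(correlation(tasks))
print(transfer_volume(tasks, sizes, placement_a))
print(transfer_volume(tasks, sizes, placement_b))
```

In this toy workload, d1 and d2 are used together by two tasks, so the layout that co-locates them moves less data than the one that separates them. A correlation-based clustering step would presumably exploit this kind of signal.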

Similar resources

Data Replication-Based Scheduling in Cloud Computing Environment

High-performance computing and vast storage are two key factors required for executing data-intensive applications. In comparison with traditional distributed systems like data grids, cloud computing provides these factors in a more affordable, scalable and elastic platform. Furthermore, accessing data files is critical for performing such applications. Sometimes accessing data becomes...

Adaptive Dynamic Data Placement Algorithm for Hadoop in Heterogeneous Environments

The Hadoop MapReduce framework is an important distributed processing model for large-scale data-intensive applications. The current Hadoop and the existing Hadoop distributed file system’s rack-aware data placement strategy in MapReduce in the homogeneous Hadoop cluster assume that each node in a cluster has the same computing capacity and that the same workload is assigned to each node. Default Hadoop d...

Communication-Aware Traffic Stream Optimization for Virtual Machine Placement in Cloud Datacenters with VL2 Topology

With the pervasiveness of cloud computing, a colossal number of applications from large organizations increasingly rely on cloud services. This demand causes a great number of applications to be executed on data centers’ servers in the form of requests for a couple of virtual machines (VMs). Some applications are too big to be processed on a single VM. Also, there exists severa...

Improving Data Availability Using Combined Replication Strategy in Cloud Environment

As data-intensive applications in cloud computing grow day after day, data popularity in this environment becomes critical and important. Hence, to improve data availability and efficient access to popular data, replication algorithms are now widely used in distributed systems. However, most of them only replicate a static number of replicas on some requested chosen sites and it is ob...

Detection of some Tree Species from Terrestrial Laser Scanner Point Cloud Data Using Support-vector Machine and Nearest Neighborhood Algorithms

Acquiring field reference data from a single tree using conventional methods is limited and time-consuming. In recent years, generating reference data for forest studies from terrestrial laser scanner data, aerial laser scanner data, radar, and optics has become commonplace, and complete, accurate 3D data from a single tree or reference trees can be recorded. The detection and identi...

A Cloud-Computing-Based Data Placement Strategy in High-Speed Railway

As an important component of China’s transportation data sharing system, high-speed railway data sharing is a typical application of data-intensive computing. Currently, most high-speed railway data is shared in a cloud computing environment. Thus, there is an urgent need for an effective cloud-computing-based data placement strategy for high-speed railways. In this paper, a new data placement stra...

Publication date: 2016
